Collecting Spontaneously Spoken Queries for Information Retrieval
Abstract
To realize speech-driven information retrieval systems that accept spontaneously spoken queries, we developed a method for collecting such speech data, derived from pre-defined search topics that had been systematically constructed for IR research. To evaluate both our method and the performance of document retrieval using spontaneously spoken queries, we conducted two experiments in which we collected speech data with our method, using publicly available test collections for evaluating document retrieval. The first, preliminary experiment used a relatively small number of search topics selected from the NTCIR-3 Web retrieval collection, which had been constructed for the TREC-style evaluation workshop, in order to test our method. The second experiment used all of the search topics released for the NTCIR-4 Web task, in order to participate in the formal run of the evaluation. This paper presents information about the collected data and the evaluation results with respect to both speech recognition accuracy and the precision of document retrieval using the collected data.
Similar resources
Experiments on Web Retrieval Driven by Spontaneously Spoken Queries
Motivated to realize the speech-driven information retrieval systems that accept spontaneously spoken queries, we developed a method to collect such speech data derived from the pre-defined search topics that had been systematically constructed for IR research. In order to evaluate both our method and the performance of the document retrieval by using the spontaneously spoken queries, we took p...
Comparing Isolately Spoken Keywords Queries for Japanese Spoken D
This paper describes a Japanese spoken document retrieval system that uses voice input queries. We prepare two types of spoken queries: isolately spoken keywords and spontaneously spoken queries. To solve a mis-recognition problem of spoken queries, N-best hypotheses of transcripts of queries are used, and keyword candidates are selected from them by mutual information between recognized words....
Incorporating Acoustic Features for Spontaneous Speech Driven Content Retrieval
A speech-driven information retrieval system is expected to be useful for gathering information with greater ease. In a conventional system, users have to decide on the contents of their utterance before speaking, which takes quite a long time when their request is complicated. To overcome that problem, it is required for the retrieval system to handle a spontaneously spoken query directly. In ...
Comparison of different phone-based spoken document retrieval methods with text and spoken queries
This study compares four phone-based spoken document retrieval (SDR) approaches. In all cases, the indexing and retrieval system uses phonetic information only. The first retrieval method is based on the vector space model, using phone 3-grams as indexing terms. This approach is compared with 2 string-matching methods. A fourth method, combining the VSM approach with the slot detection step of ...
Cross-Language Image Retrieval via Spoken Query
This paper studies cross-language cross-medium information retrieval. We introduce several approaches to unify the languages and media of queries and documents. We experiment on cross-language image retrieval via spoken query. Two approaches are proposed to recognize and translate spoken queries. We also propose a similarity-based approach to identify and backward transliterate named entities i...
Publication date: 2004